3 research outputs found

Secure Decentralized Decisions in Consolidated Hospital Systems: Intelligent Agents and Blockchain

Author: Badré Adrien
Publication date: 01/01/2018

Shared decision making has become an important approach to building a consolidated healthcare system. While some research in the healthcare literature discusses the advantages and disadvantages of shared decision making, its efficiency has not been addressed quantitatively. In this thesis, we propose a universal decentralized decision-making architecture that uses blockchain technology and machine learning (predictive and prescriptive analytics) to address the need for efficient, integrated coordination among healthcare providers and patients. The healthcare process considered is the assignment of a patient to the best physician and hospital in a consolidated hospital system. After designing the Decentralized Patients Assignment System (DPAS), we simulate the model using agent-based modeling (ABM). The ABM consists of four agents: patient, physician, hospital, and miner (assignment algorithms), which interact inside a decentralized integrated system. The proposed mechanism highlights the importance of interoperability, enabled by blockchain technology, between healthcare agents in the decision-making process. To illustrate the model's efficiency, two scenarios are simulated and their results compared. The results demonstrate the proposed model's efficiency in terms of assignment rate, computational time, and cost.

SHAREOK repository

Interpretable deep neural networks for more accurate predictive genomics and genome-wide association studies

Author: Badré Adrien
Publication date: 21/04/2023

Genome-wide association studies (GWAS) and predictive genomics have become increasingly important in genetics research over the past decade. GWAS involves the analysis of the entire genome of a large group of individuals to identify genetic variants associated with a particular trait or disease. Predictive genomics combines information from multiple genetic variants to predict the polygenic risk score (PRS) of an individual for developing a disease. Machine learning is a branch of artificial intelligence that has revolutionized various fields of study, including computer vision, natural language processing, and robotics. Machine learning focuses on developing algorithms and models that enable computers to learn from data and make predictions or decisions without being explicitly programmed. Deep learning is a subset of machine learning that uses deep neural networks to recognize patterns and relationships. In this dissertation, we first compared various machine learning and statistical models for estimating breast cancer PRS. A deep neural network (DNN) was found to be the most effective, outperforming other techniques such as BLUP, BayesA, and LDpred. In the test cohort with 50% prevalence, the areas under the receiver operating characteristic curve (ROC AUCs) were 67.4% for DNN, 64.2% for BLUP, 64.5% for BayesA, and 62.4% for LDpred. While BLUP, BayesA, and LDpred generated PRS that followed a normal distribution in the case population, the PRS generated by DNN followed a bimodal distribution. This allowed DNN to achieve a recall of 18.8% at 90% precision in the test cohort, which extrapolates to 65.4% recall at 20% precision in a general population. Interpretation of the DNN model identified significant variants that were previously overlooked by GWAS, highlighting their importance in predicting breast cancer risk.
We then developed a linearizing neural network architecture (LINA) that provided first-order and second-order interpretations on both the instance-wise and model-wise levels, addressing the challenge of interpretability in neural networks. LINA outperformed other algorithms in providing accurate and versatile model interpretation, as demonstrated in synthetic datasets and real-world predictive genomics applications, by identifying salient features and feature interactions used for predictions. Finally, it has been observed that many complex diseases are related to each other through common genetic factors, such as pleiotropy or shared etiology. We hypothesized that this genetic overlap can be used to improve the accuracy of polygenic risk scores (PRS) for multiple diseases simultaneously. To test this hypothesis, we proposed an interpretable multi-task learning (MTL) approach based on the LINA architecture. We found that the parallel estimation of PRS for 17 prevalent cancers using a pan-cancer MTL model was generally more accurate than independent estimations for individual cancers using comparable single-task learning models. Similar performance improvements were observed for 60 prevalent non-cancer diseases in a pan-disease MTL model. Interpretation of the MTL models revealed significant genetic correlations between important sets of single nucleotide polymorphisms, suggesting that there is a well-connected network of diseases with a shared genetic basis.

SHAREOK repository

Deep neural network improves the estimation of polygenic risk scores for breast cancer

Authors: Badré Adrien, Muchero Wellington, Pan Chongle, Reynolds Justin C., Zhang Li
Publication date: 24/07/2023

Polygenic risk scores (PRS) estimate the genetic risk of an individual for a complex disease based on many genetic variants across the whole genome. In this study, we compared a series of computational models for estimation of breast cancer PRS. A deep neural network (DNN) was found to outperform alternative machine learning techniques and established statistical algorithms, including BLUP, BayesA, and LDpred. In the test cohort with 50% prevalence, the area under the receiver operating characteristic curve (AUC) was 67.4% for DNN, 64.2% for BLUP, 64.5% for BayesA, and 62.4% for LDpred. BLUP, BayesA, and LDpred all generated PRS that followed a normal distribution in the case population. However, the PRS generated by DNN in the case population followed a bimodal distribution composed of two normal distributions with distinctly different means. This suggests that DNN was able to separate the case population into a high-genetic-risk case sub-population with an average PRS significantly higher than the control population and a normal-genetic-risk case sub-population with an average PRS similar to the control population. This allowed DNN to achieve 18.8% recall at 90% precision in the test cohort with 50% prevalence, which can be extrapolated to 65.4% recall at 20% precision in a general population with 12% prevalence. Interpretation of the DNN model identified salient variants that were assigned insignificant p-values by association studies, but were important for DNN prediction. These variants may be associated with the phenotype through non-linear relationships. Comment: 28 pages, 7 figures, 2 tables

arXiv.org e-Print Archive
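
The breast cancer abstracts above report precision and recall at 50% prevalence and then quote figures for a general population with 12% prevalence. The Python sketch below illustrates why precision depends on prevalence while recall and the false positive rate do not. It is not the authors' extrapolation procedure. The function names are hypothetical, and the only inputs taken from the abstract are the 18.8% recall, 90% precision, and the two prevalence values.

```python
def false_positive_rate(recall, precision, prevalence):
    """Back out the false positive rate implied by a (recall, precision)
    pair observed in a cohort with the given case prevalence."""
    true_pos = recall * prevalence                         # fraction of cohort: true positives
    false_pos = true_pos * (1.0 - precision) / precision   # fraction of cohort: false positives
    return false_pos / (1.0 - prevalence)                  # normalize by the control fraction


def precision_at_prevalence(recall, fpr, prevalence):
    """Precision of the same classifier threshold in a population with a
    different prevalence (recall and FPR are assumed unchanged)."""
    true_pos = recall * prevalence
    false_pos = fpr * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Operating point reported for the test cohort (50% prevalence).
recall = 0.188
fpr = false_positive_rate(recall, precision=0.90, prevalence=0.50)

# Same threshold applied to a population with 12% prevalence.
precision_general = precision_at_prevalence(recall, fpr, prevalence=0.12)

print(f"implied FPR: {fpr:.4f}")
print(f"precision at 12% prevalence: {precision_general:.3f}")
```

Under these assumptions, the same threshold gives lower precision in the lower-prevalence population, because the fixed false positive rate now applies to a larger share of controls. The 65.4% recall at 20% precision quoted in the abstract corresponds to a different, more permissive threshold. Reproducing that figure would require the full ROC curve, which the abstract does not provide.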